
Think-on-Graph 2.0: Deep and Interpretable Large Language Model Reasoning with Knowledge Graph-guided Retrieval

Ma, Shengjie, Xu, Chengjin, Jiang, Xuhui, Li, Muzhi, Qu, Huaren, Guo, Jian

arXiv.org Artificial Intelligence

Retrieval-augmented generation (RAG) has significantly advanced large language models (LLMs) by enabling dynamic information retrieval to mitigate knowledge gaps and hallucinations in generated content. However, these systems often falter with complex reasoning and consistency across diverse queries. In this work, we present Think-on-Graph 2.0, an enhanced RAG framework that aligns questions with the knowledge graph and uses it as a navigational tool, deepening and refining the RAG paradigm for information collection and integration. The KG-guided navigation fosters deep and long-range associations to uphold logical consistency, and it optimizes the scope of retrieval for precision and interpretability. In conjunction, factual consistency can be better ensured through semantic similarity guided by precise directives. ToG-2.0 not only improves the accuracy and reliability of LLMs' responses but also demonstrates the potential of hybrid structured knowledge systems to significantly advance LLM reasoning, bringing it closer to human-like performance. We conducted extensive experiments on four public datasets to demonstrate the advantages of our method compared to the baselines.
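The KG-guided navigation the abstract describes can be caricatured as an iterative expansion over graph edges, pruning branches that drift away from the question. The sketch below is purely illustrative: the toy graph, the word-overlap relevance score (a crude stand-in for the semantic similarity the paper uses), and all function names are assumptions, not the paper's actual implementation.

```python
# Toy knowledge graph: entity -> list of (relation, target) triples.
TOY_KG = {
    "Paris": [("capital_of", "France"), ("located_on", "Seine")],
    "France": [("currency", "Euro"), ("in_continent", "Europe")],
    "Seine": [("flows_into", "English Channel")],
}

def relevance(question: str, text: str) -> float:
    """Crude stand-in for semantic similarity: fraction of fact words in the question."""
    q, t = set(question.lower().split()), set(text.lower().split())
    return len(q & t) / (len(t) or 1)

def kg_guided_retrieve(question: str, seeds, max_hops: int = 2, threshold: float = 0.0):
    """Expand from seed entities hop by hop, keeping only on-topic triples.

    Off-topic branches are pruned, so later hops stay anchored to the question.
    """
    frontier, facts = list(seeds), []
    for _ in range(max_hops):
        next_frontier = []
        for entity in frontier:
            for relation, target in TOY_KG.get(entity, []):
                fact = f"{entity} {relation.replace('_', ' ')} {target}"
                if relevance(question, fact) > threshold:
                    facts.append(fact)
                    next_frontier.append(target)
        frontier = next_frontier
    return facts

facts = kg_guided_retrieve("which currency is used in paris", ["Paris"])
print(facts)
```

In a real system the collected facts would then be handed to the LLM as grounded context; the pruning step is what keeps multi-hop expansion from exploding into the whole graph.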


On the Promises and Challenges of Multimodal Foundation Models for Geographical, Environmental, Agricultural, and Urban Planning Applications

Tan, Chenjiao, Cao, Qian, Li, Yiwei, Zhang, Jielu, Yang, Xiao, Zhao, Huaqin, Wu, Zihao, Liu, Zhengliang, Yang, Hao, Wu, Nemin, Tang, Tao, Ye, Xinyue, Chai, Lilong, Liu, Ninghao, Li, Changying, Mu, Lan, Liu, Tianming, Mai, Gengchen

arXiv.org Artificial Intelligence

The advent of large language models (LLMs) has heightened interest in their potential for multimodal applications that integrate language and vision. This paper explores the capabilities of GPT-4V in the realms of geography, environmental science, agriculture, and urban planning by evaluating its performance across a variety of tasks. Data sources comprise satellite imagery, aerial photos, ground-level images, field images, and public datasets. The model is evaluated on a series of tasks including geo-localization, textual data extraction from maps, remote sensing image classification, visual question answering, crop type identification, disease/pest/weed recognition, chicken behavior analysis, agricultural object counting, urban planning knowledge question answering, and plan generation. The results indicate the potential of GPT-4V in geo-localization, land cover classification, visual question answering, and basic image understanding. However, there are limitations in several tasks requiring fine-grained recognition and precise counting. While zero-shot learning shows promise, performance varies across problem domains and image complexities. The work provides novel insights into GPT-4V's capabilities and limitations for real-world geospatial, environmental, agricultural, and urban planning challenges. Further research should focus on augmenting the model's knowledge and reasoning for specialized domains through expanded training. Overall, the analysis demonstrates foundational multimodal intelligence, highlighting the potential of multimodal foundation models (FMs) to advance interdisciplinary applications at the nexus of computer vision and language.


Scientists discover an ancient Florida village that predates Columbus by hundreds of years

Daily Mail - Science & tech

This week researchers from the University of Florida published findings from an archaeological project that sheds new light on what life was like in North America before Christopher Columbus arrived. Using drones to scan the coastline of northwestern Florida, researchers discovered evidence of a settlement dated to between 900 and 1200 AD. The settlement could have supported between 200 and 300 people, who they believe worked to create beads and decorative ornaments from shells, items that played an important role in Mississippian culture at the time. The settlement was discovered on Raleigh Island, halfway between Tampa and Tallahassee on Florida's northwestern coast, just outside the Cedar Keys Wildlife Refuge. The drone that discovered the settlement was equipped with a LiDAR system, according to a report by Ars Technica. LiDAR sends out pulses of light and measures how long they take to reflect back from the environment, using those travel times to build a three-dimensional image of the terrain.
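The time-of-flight principle behind LiDAR ranging reduces to a single formula: distance is the speed of light times the round-trip travel time, divided by two. A minimal sketch, with a made-up pulse timing for illustration:

```python
# Time-of-flight ranging as used by LiDAR: a light pulse travels out to the
# surface and back, so the one-way distance is half the round trip.
C = 299_792_458.0  # speed of light in m/s

def distance_from_return(round_trip_seconds: float) -> float:
    """Distance in meters to the reflecting surface."""
    return C * round_trip_seconds / 2.0

# A pulse returning after ~667 nanoseconds reflected off terrain ~100 m below.
print(round(distance_from_return(667e-9), 2))
```

A survey drone repeats this measurement hundreds of thousands of times per second while tracking its own position, and the resulting point cloud is what reveals terrain features (such as buried settlement mounds) that are invisible in ordinary photos.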